#first-packet latency
23/09/2025
VoXtream Starts Speaking From the First Word — Open-Source Full-Stream Zero-Shot TTS for Real-Time Use
'VoXtream is an open-source, full-stream, zero-shot TTS system that begins speaking from the first word, delivering sub-120 ms first-packet latency on a GPU with a compiled model and emitting audio in continuous 80 ms frames.'